A Language-Based Approach to Categorical Analysis

Authors

  • Alexander Marlow
  • Cameron Alexander Marlow
  • Brian K. Smith
Abstract

With the digitization of media, computers can be employed to help us with the process of classification, both by learning from our behavior to perform the task for us and by exposing new ways for us to think about our information. Given that most of our media comes in the form of electronic text, research in this area focuses on building automatic text classification systems. The standard representation employed by these systems, known as the bag-of-words approach to information retrieval, represents documents as collections of words. As a byproduct of this model, automatic classifiers have difficulty distinguishing between different meanings of a single word. This research presents a new computational model of electronic text, called a synchronic imprint, which uses structural information to contextualize the meaning of words. Every concept in the body of a text is described by its relationships with other concepts in the same text, allowing classification systems to distinguish between alternative meanings of the same word. This representation is applied both to the standard problem of text classification and to the task of enabling people to better identify large bodies of text. The latter is achieved through the development of a visualization tool named flux that models synchronic imprints as a spring network.

Thesis Advisor: Walter Bender, Executive Director, MIT Media Laboratory

The author gratefully thanks the Motorola Fellows Program for its support and input into the development of this research.
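The word-sense limitation of the bag-of-words model mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's synchronic imprint; the `neighbors` function is a hypothetical stand-in that treats "relationship" as plain sentence-level co-occurrence, and the two toy documents are invented for illustration. They contain exactly the same words, so their bags are identical, yet the word "bank" keeps different company in each.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Represent a document as word counts, discarding all structure."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def neighbors(text, word):
    """Hypothetical relational context: the words that share a sentence with `word`.

    This is an assumption made for illustration, not the model from the thesis.
    """
    related = set()
    for sentence in text.lower().split("."):
        tokens = set(re.findall(r"[a-z]+", sentence))
        if word in tokens:
            related |= tokens - {word}
    return related


# Toy keyword "documents" (invented): same vocabulary, different groupings.
doc_a = "bank loan interest. river water shore."
doc_b = "bank river water shore. loan interest."

# The bags are indistinguishable, so a bag-of-words classifier sees one document.
print(bag_of_words(doc_a) == bag_of_words(doc_b))

# The relational view separates the financial sense from the riverside sense.
print(sorted(neighbors(doc_a, "bank")))
print(sorted(neighbors(doc_b, "bank")))
```

Describing a word by the other words it relates to is the general idea the abstract attributes to synchronic imprints; the actual structural information used there may differ from simple co-occurrence.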

Similar resources

On minimal realization of IF-languages: A categorical approach

The purpose of this work is to introduce and study the concept of a minimal deterministic automaton with IF-outputs which realizes a given IF-language. Of the two methods presented here for constructing such an automaton, one is based on Myhill-Nerode's theory while the other is based on derivatives of the given IF-language. Meanwhile, the categories of deterministic automata with IF-outputs and ...

On the effectiveness of integrated skills approach in language teaching: a meta-analysis

This meta-analysis was conducted to synthesize the effect of 22 primary studies which have been conducted to test the effect of the integrated skills approach (ISA) on language skills and components. Three questions guide this analysis: What is the overall effect of ISA on language skills and sub-skills? To what extent moderator variables such as learners' level of education and proficiency mod...

A Stylistic and Proficiency-based Approach to EFL Learners’ Performance Inconsistency

Performance deficiencies and inconsistencies among SLA or FL learners can be attributed to a variety of sources, including both systemic (i.e., language issues) and individual variables. Despite a rich background, the literature still suffers from a gap as far as examining the issue from the perspective of language proficiency and learning style is concerned. To fill the gap, this study addressed EFL learner...

Town trip forecasting based on data mining techniques

In this paper, a data mining approach is proposed for duration prediction of the town trips (travel time) in New York City. In this regard, at first, two novel approaches, including a mathematical and a statistical approach, are proposed for grouping categorical variables with a huge number of levels. The proposed approaches work based on the cost matrix generated by repetitive post-hoc tests f...

Development and Validation of Teacher Emotional Support Scale: a structural equation modeling approach

A review of the literature indicated that no validated model examines the extent to which teachers support their students emotionally in EFL classrooms. Therefore, the present study elaborated on this issue by developing and validating a teacher emotional support scale in an Iranian English as a foreign language context. Main components of the scale have been specified based on Hamre...

A Model of Iranian EFL Learners' Cultural Identity: A Structural Equation Modeling Approach

This study aimed, firstly, to investigate the underlying components of Iranian cultural identity and, secondly, to confirm the aforementioned components via Structural Equation Modeling (SEM) analysis. In order to achieve these goals, the researchers reviewed the extensive local and international literature on language, culture and identity. Based on the literature and consultations with a grou...

Publication date: 2000